Skip to content

fix(security): implement ed25519 signature verification in nexus #537#587

Closed
CuriousHet wants to merge 8 commits intomicrosoft:mainfrom
CuriousHet:security/nexus-sig-verification-537
Closed

fix(security): implement ed25519 signature verification in nexus #537#587
CuriousHet wants to merge 8 commits intomicrosoft:mainfrom
CuriousHet:security/nexus-sig-verification-537

Conversation

@CuriousHet
Copy link
Copy Markdown

Description

This PR implements robust Ed25519 cryptographic signature verification for the Nexus module, addressing the security gaps and placeholder TODOs identified in issue #537. It ensures that only authorized agents can register or deregister themselves by verifying signatures against their public keys. Additionally, it adds signing support for escrow requests and modernizes datetime handling to comply with modern Python standards.

Key Changes:

  • Agent Registry Security: Added mandatory signature verification for register and deregister operations in AgentRegistry.
  • Cryptographic Utilities: Created modules/nexus/crypto.py for canonical payload generation and Ed25519 sign/verify logic.
  • Escrow Signing: Updated ProofOfOutcome to support and verify signatures for escrow creation.
  • Backward Compatibility: Implemented a "legacy cutover" for agents registered before 2025-01-01 to ensure the new security requirements don't break existing records.
  • Modernization: Replaced deprecated datetime.utcnow() with datetime.now(timezone.utc) across the nexus module, resolving 56 DeprecationWarnings for Python 3.12+.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Maintenance (dependency updates, CI/CD, refactoring)
  • Security fix

Package(s) Affected

  • agent-os-kernel
  • agent-mesh
  • agent-runtime
  • agent-sre
  • agent-governance
  • docs / root

Checklist

  • My code follows the project style guidelines (ruff check)
  • I have added tests that prove my fix/feature works
  • All new and existing tests pass (pytest)
  • I have updated documentation as needed
  • I have signed the Microsoft CLA

Related Issues

Closes #537

@github-actions
Copy link
Copy Markdown

Welcome to the Agent Governance Toolkit! Thanks for your first pull request.
Please ensure tests pass, code follows style (ruff check), and you have signed the CLA.
See our Contributing Guide.

@github-actions github-actions bot added documentation Improvements or additions to documentation tests size/L Large PR (< 500 lines) labels Mar 29, 2026
Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Pull Request Review: fix(security): implement ed25519 signature verification in nexus #537


🔴 CRITICAL: Security Issues

  1. Lack of Replay Protection:

    • The current implementation does not include a mechanism to prevent replay attacks. An attacker could reuse a valid signature to perform unauthorized actions (e.g., re-register or deregister an agent).
    • Recommendation: Introduce a nonce or timestamp validation in the signed payload to ensure that each signature is unique and tied to a specific operation.
  2. Insufficient Validation for Legacy Cutover:

    • The _is_unsigned_legacy_entry method relies solely on the registered_at timestamp to determine whether an entry is exempt from signature verification. This approach is vulnerable to manipulation if an attacker can modify the registered_at field.
    • Recommendation: Store legacy entries in a separate immutable ledger or database to prevent tampering. Alternatively, use cryptographic proofs to verify the integrity of legacy entries.
  3. Potential Key Parsing Vulnerability:

    • The parse_public_key function does not validate the length or structure of the base64-decoded key. This could lead to malformed keys being accepted.
    • Recommendation: Add strict validation for the key length and format after decoding.

🟡 WARNING: Potential Breaking Changes

  1. Escrow Signature Enforcement:

    • The addition of signature verification for escrow creation (create_escrow) introduces a breaking change for agents that do not provide a private key. While legacy mode is supported, this change may disrupt workflows for agents that rely on unsigned operations.
    • Recommendation: Clearly document this change and provide migration guidance for agents transitioning to signed operations.
  2. Agent Registry Signature Enforcement:

    • The mandatory signature verification for register and deregister operations may break compatibility with existing agents that do not support signing. Although legacy support is provided, the transition period should be clearly communicated.
    • Recommendation: Extend the transition period and provide tools to help agents update their registration records with signatures.

💡 Suggestions for Improvement

  1. Unit Test Coverage:

    • While the PR includes tests for the crypto module and escrow signing, it lacks comprehensive tests for the AgentRegistry class, particularly for the register and deregister methods.
    • Recommendation: Add tests to cover edge cases, such as:
      • Invalid signatures during registration and deregistration.
      • Legacy entries bypassing signature checks.
      • Malformed public keys or signatures.
  2. Error Messages:

    • The error messages in SignatureDecodeError and SignatureVerificationError could be more descriptive to aid debugging.
    • Recommendation: Include details about the expected format and the actual input in the error messages.
  3. Backward Compatibility:

    • The transition period for legacy agents is hardcoded to end on 2025-01-01. This may not be sufficient for all users.
    • Recommendation: Make the cutover date configurable via an environment variable or configuration file.
  4. Thread Safety:

    • The AgentRegistry class does not appear to be thread-safe. Concurrent calls to register or deregister could lead to race conditions.
    • Recommendation: Use locks or other synchronization mechanisms to ensure thread safety.
  5. OWASP Agentic Top 10 Compliance:

    • While the PR addresses authentication and integrity (A1), it does not address other OWASP Agentic Top 10 concerns, such as secure logging (A10) or data minimization (A4).
    • Recommendation: Ensure sensitive data (e.g., private keys) is never logged, and consider implementing additional security measures for data minimization.
  6. Documentation:

    • The updated README provides a high-level overview of the changes but lacks detailed instructions for generating and managing keys.
    • Recommendation: Add a dedicated section on key management, including how to generate keys, store them securely, and rotate them.

Summary of Actionable Feedback

  • 🔴 Implement replay protection using nonces or timestamps in signed payloads.
  • 🔴 Strengthen validation for legacy entries to prevent tampering.
  • 🔴 Add strict validation for public key parsing.
  • 🟡 Clearly document breaking changes and provide migration tools for legacy agents.
  • 💡 Add unit tests for edge cases in AgentRegistry and other affected modules.
  • 💡 Improve error messages for better debugging.
  • 💡 Make the legacy cutover date configurable.
  • 💡 Ensure thread safety in AgentRegistry.
  • 💡 Address additional OWASP Agentic Top 10 concerns, such as secure logging and data minimization.
  • 💡 Enhance documentation with key management best practices.

This PR introduces critical security improvements but requires additional safeguards and testing to ensure robustness and backward compatibility.

@github-actions
Copy link
Copy Markdown

🤖 AI Agent: contributor-guide — Welcome 🎉

Welcome 🎉

Hi @first-time-contributor! 👋 Welcome to the microsoft/agent-governance-toolkit community, and thank you for your contribution! We're thrilled to have you here, and we appreciate the time and effort you've put into this pull request. 🎉

What You Did Well ✅

  1. Comprehensive Implementation: You've done an excellent job implementing Ed25519 signature verification. The modular design of crypto.py is clean and well-structured, making it easy to maintain and extend in the future.
  2. Backward Compatibility: Your inclusion of a "legacy cutover" for agents registered before 2025-01-01 is a thoughtful touch that ensures a smooth transition without breaking existing functionality.
  3. Testing: You've added thorough test coverage for the new cryptographic utilities and updated tests for the registry and escrow modules. This is fantastic and shows your commitment to quality.
  4. Documentation: The updates to the README.md are clear and provide a great overview of the new functionality. The inclusion of examples is especially helpful for new users.

Suggestions for Improvement 🛠️

  1. Code Style and Linting:

    • While you mentioned running ruff check, there are a few areas where the code could be further aligned with our linting standards. For example:
      • In crypto.py, the canonical_payload function has a long docstring. Consider breaking it into shorter lines for better readability.
      • Some functions in crypto.py could benefit from type hints for all parameters and return types (e.g., sign_data has private_key as Optional[any], which could be more specific).
    • Please ensure your code passes ruff with our selected rules (E, F, W).
  2. Test Organization:

    • While your tests are well-written, they should be placed in the appropriate directory. For this project, tests for the nexus module should go in packages/nexus/tests/. Please move test_crypto.py and test_proof_of_outcome.py to this directory.
  3. Commit Messages:

    • We follow the Conventional Commits standard. Your PR title is great (fix(security): implement ed25519 signature verification in nexus), but the individual commit messages should also follow this format. For example:
      • feat(nexus): add Ed25519 signature verification
      • test(nexus): add tests for crypto utilities
      • fix(nexus): replace datetime.utcnow with datetime.now(timezone.utc)
    • Please consider squashing and rewording your commits to align with this convention.
  4. Security Considerations:

    • Since this PR introduces security-sensitive code, we need to carefully review the cryptographic implementation. While your code looks solid at first glance, we'll need to conduct a deeper review to ensure there are no vulnerabilities. For example:
      • Have you considered edge cases where the canonical_payload function might fail or produce inconsistent results?
      • Are there any potential risks with the legacy cutover mechanism? For instance, could it be exploited by malicious actors?
    • If you haven't already, please review our CONTRIBUTING.md for security guidelines.

Next Steps 🚀

  1. Address Feedback: Please review the suggestions above and make any necessary changes to your PR.
  2. Run Linting: Ensure your code passes ruff with our selected rules (E, F, W).
  3. Update Tests: Move your test files to the appropriate directory (packages/nexus/tests/) and ensure all tests pass (pytest).
  4. Update Commit Messages: Reword your commit messages to follow the Conventional Commits format.
  5. Push Changes: Once you've made the updates, push them to your branch. This will automatically update the pull request.

If you have any questions or need help with any of the feedback, feel free to ask! We're here to help you succeed. 😊

Thank you again for your contribution, and we look forward to collaborating with you! 🚀

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 29, 2026

🤖 AI Agent: security-scanner — Security Review of PR #537: Implement Ed25519 Signature Verification in Nexus

Security Review of PR #537: Implement Ed25519 Signature Verification in Nexus

This pull request introduces Ed25519 signature verification to the Nexus module, which is a critical security enhancement for ensuring the authenticity and integrity of agent operations. Below is a detailed security analysis of the changes made in this PR.


Findings

1. Prompt Injection Defense Bypass

No issues found. The PR does not involve any prompt handling or user-provided natural language input that could be exploited for prompt injection attacks.

Rating: 🔵 LOW


2. Policy Engine Circumvention

The PR introduces a mechanism to verify signatures for agent registration and deregistration, which is a critical step in enforcing policy compliance. However, there is a potential policy circumvention risk due to the _is_unsigned_legacy_entry method in registry.py. This method allows agents registered before 2025-01-01 to bypass signature verification. While this is intended for backward compatibility, it creates a potential attack vector where an attacker could exploit this "legacy" exemption to register or deregister agents without proper verification.

Attack Vector: An attacker could manipulate the registered_at timestamp of a manifest to make it appear as a legacy entry, thereby bypassing signature verification.

Suggested Fix:

  • Store the registered_at timestamp in a tamper-proof manner (e.g., signed or hashed) when the agent is first registered.
  • Ensure that the legacy bypass is only applied to agents that were genuinely registered before the cutover date and not to newly created or tampered manifests.

Rating: 🟠 HIGH


3. Trust Chain Weaknesses

The PR introduces Ed25519 signature verification, which is a robust cryptographic mechanism. However, there are a few potential weaknesses in the trust chain:

  • Public Key Format Validation: The parse_public_key function assumes that the public key is always in the format ed25519:<base64>. If an attacker provides a malformed key, it could lead to unexpected behavior.
  • Key Management: There is no mention of how public keys are securely distributed or managed. If an attacker can replace a legitimate public key with their own, they could impersonate an agent.

Attack Vector: An attacker could exploit weak key management practices to inject their own public key into the system, allowing them to bypass signature verification.

Suggested Fix:

  • Add stricter validation for the public key format and length in parse_public_key.
  • Implement a mechanism for securely distributing and pinning public keys, such as using a trusted certificate authority or a blockchain-based registry.

Rating: 🟠 HIGH


4. Credential Exposure

No issues found. The PR does not introduce any new mechanisms that could lead to credential exposure. Private keys are handled securely within the sign_data function and are not logged or exposed.

Rating: 🔵 LOW


5. Sandbox Escape

No issues found. The PR does not involve any changes to sandboxing or process isolation mechanisms.

Rating: 🔵 LOW


6. Deserialization Attacks

The canonical_payload function ensures that data is serialized in a deterministic and secure manner, which mitigates the risk of deserialization attacks. However, the use of json.dumps with default=str could potentially introduce issues if unexpected data types are passed.

Attack Vector: If an attacker can inject malicious data types into the payload, it could lead to unexpected behavior during serialization.

Suggested Fix:

  • Add stricter validation for the input data to canonical_payload to ensure it only contains serializable types.

Rating: 🟡 MEDIUM


7. Race Conditions

The PR does not introduce any new concurrency mechanisms that could lead to race conditions. However, the _is_unsigned_legacy_entry method could potentially be exploited in a Time-of-Check to Time-of-Use (TOCTOU) attack if an attacker can modify the registered_at timestamp between the check and the signature verification.

Attack Vector: An attacker could exploit a race condition to modify the registered_at timestamp of a manifest after it has been checked but before the signature verification step.

Suggested Fix:

  • Lock the manifest during the _is_unsigned_legacy_entry check and signature verification to prevent concurrent modifications.

Rating: 🟡 MEDIUM


8. Supply Chain

The PR introduces a new dependency on the cryptography library, which is a well-maintained and widely used library for cryptographic operations. However, the version specified (>=41.0.0) should be pinned to a specific version to prevent supply chain attacks.

Attack Vector: If a malicious actor compromises the cryptography package, they could introduce vulnerabilities or backdoors into the system.

Suggested Fix:

  • Pin the cryptography dependency to a specific version and regularly update it after thorough testing.
  • Consider using a dependency scanning tool to monitor for vulnerabilities in third-party libraries.

Rating: 🟡 MEDIUM


Summary of Findings

Issue Rating Description Suggested Fix
Prompt Injection Defense Bypass 🔵 LOW No prompt injection vulnerabilities identified. No action needed.
Policy Engine Circumvention 🟠 HIGH Legacy bypass for signature verification could be exploited. Use tamper-proof storage for registered_at and ensure legacy bypass applies only to valid entries.
Trust Chain Weaknesses 🟠 HIGH Potential issues with public key validation and management. Add stricter validation for public keys and implement secure key distribution/pinning mechanisms.
Credential Exposure 🔵 LOW No credentials are exposed in logs or error messages. No action needed.
Sandbox Escape 🔵 LOW No sandboxing mechanisms are affected. No action needed.
Deserialization Attacks 🟡 MEDIUM Potential risk from unexpected data types in canonical_payload. Add stricter validation for input data to canonical_payload.
Race Conditions 🟡 MEDIUM Potential TOCTOU issue in _is_unsigned_legacy_entry. Lock manifests during legacy check and signature verification.
Supply Chain 🟡 MEDIUM Dependency on cryptography is not pinned to a specific version. Pin cryptography to a specific version and monitor for vulnerabilities.

Final Recommendation

The PR introduces critical security improvements by implementing Ed25519 signature verification. However, the identified issues, particularly the legacy bypass and trust chain weaknesses, must be addressed to ensure the robustness of the security model. The PR should not be merged until the suggested fixes for the HIGH severity issues are implemented.

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Pull Request Review: fix(security): implement ed25519 signature verification in nexus #537


🔴 CRITICAL: Security Issues

  1. Signature Verification Bypass for Legacy Agents:

    • The _is_unsigned_legacy_entry method allows agents registered before 2025-01-01 to bypass signature verification. While backward compatibility is important, this creates a potential security loophole where malicious actors could exploit legacy entries to bypass cryptographic checks.
    • Recommendation: Introduce additional safeguards for legacy entries, such as requiring a secondary verification mechanism (e.g., a hashed secret or a migration process to enforce signatures for legacy agents).
  2. Unvalidated Public Key Format:

    • The parse_public_key function assumes the public key format is valid and only checks for the ed25519: prefix. This could lead to malformed or malicious keys being accepted.
    • Recommendation: Add stricter validation for the base64-encoded key, such as checking the length and ensuring it conforms to Ed25519 key specifications.
  3. Canonical Payload Exclusion Logic:

    • The canonical_payload function excludes certain fields (signature, requester_signature, etc.) from the payload. If these exclusions are not consistently applied across all modules, it could lead to mismatched payloads during verification.
    • Recommendation: Centralize the exclusion logic and ensure all modules use the same canonicalization rules.
  4. Escrow Signing Logic:

    • In create_escrow, the fallback to legacy signatures (sig_{requester_did}_{task_hash[:8]}) is insecure and could be easily spoofed.
    • Recommendation: Deprecate the legacy signature format and enforce cryptographic signing for all escrow requests. Alternatively, introduce a migration plan for legacy agents.

🟡 WARNING: Potential Breaking Changes

  1. Backward Compatibility for Legacy Agents:

    • The introduction of mandatory signature verification for register and deregister operations may break existing workflows for agents registered before 2025-01-01.
    • Recommendation: Clearly document the transition plan and provide tools for agents to update their manifests with valid signatures.
  2. Datetime Modernization:

    • Replacing datetime.utcnow() with datetime.now(timezone.utc) may cause subtle changes in behavior for code relying on naive datetime objects.
    • Recommendation: Audit all downstream consumers of datetime fields to ensure they handle timezone-aware datetimes correctly.

💡 Suggestions for Improvement

  1. Thread Safety:

    • The AgentRegistry and EscrowManager classes appear to handle concurrent operations (register, deregister, create_escrow). Ensure thread safety by using locks or atomic operations where necessary.
    • Recommendation: Add tests simulating concurrent access to these methods to verify thread safety.
  2. OWASP Agentic Top 10 Compliance:

    • While Ed25519 signature verification improves trust and identity, consider additional measures such as rate limiting for register and deregister operations to prevent abuse.
    • Recommendation: Implement rate limiting and logging for sensitive operations.
  3. Type Safety:

    • The crypto.py module uses Union[dict, BaseModel] for payloads. This could lead to runtime errors if an unsupported type is passed.
    • Recommendation: Use stricter type annotations or enforce type checks at runtime.
  4. Testing Coverage:

    • The new tests for crypto.py and ProofOfOutcome provide good coverage, but additional edge cases should be tested:
      • Invalid or malformed public keys.
      • Payloads with deeply nested structures.
      • Concurrent signature verification requests.
    • Recommendation: Expand test coverage to include these scenarios.
  5. Documentation:

    • The updated README provides a good overview of the changes but lacks details on how to migrate legacy agents to the new signature-based system.
    • Recommendation: Add a migration guide and examples for updating legacy manifests.

Summary

This PR significantly improves the security of the Nexus module by implementing Ed25519 signature verification. However, the legacy agent handling introduces critical security risks that must be addressed. Additionally, care should be taken to ensure backward compatibility and thread safety. With these adjustments, the changes will align better with the project's security and reliability goals.

Action Items:

  1. Address the critical security issues related to legacy agent handling and public key validation.
  2. Document the migration plan for legacy agents.
  3. Expand test coverage for edge cases and concurrency.
  4. Audit datetime changes for potential breaking behavior.

Let me know if you need further clarification or assistance!

@imran-siddique imran-siddique enabled auto-merge (squash) March 29, 2026 20:34
Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Review Summary

This pull request introduces significant improvements to the security of the Nexus module by implementing Ed25519 signature verification for agent registration and deregistration processes. The changes also include modernization of datetime handling and backward compatibility measures for legacy agents. Below are the detailed observations and recommendations based on the focus areas outlined for the review.

Security Review

  1. Policy Engine Correctness:

    • The implementation of Ed25519 signature verification is a strong addition that mitigates the risk of unauthorized agent registration. Ensure that the verification logic is thoroughly tested to prevent any potential bypasses.
  2. Trust/Identity:

    • The use of Ed25519 for signing and verifying agent identities is appropriate. Ensure that the private keys are handled securely and are not exposed in logs or error messages. Consider implementing key rotation policies.
  3. Sandbox Escape Vectors:

    • Review the new crypto.py module for potential vulnerabilities, especially in how it handles input data for signature verification. Ensure that all inputs are sanitized and validated.
  4. Thread Safety:

    • The changes do not appear to introduce any threading issues, but ensure that any shared state (if any) is properly synchronized when accessed by multiple threads.
  5. OWASP Agentic Top 10 Compliance:

    • The implementation appears to address several OWASP concerns, particularly around authentication and data integrity. Ensure that the documentation clearly outlines how these security measures are implemented.

Type Safety and Pydantic Model Validation

  • The use of Pydantic models for data validation is commendable. Ensure that all models are thoroughly tested, especially those that interact with the new signature verification logic. Consider adding more explicit validation for fields that are critical to security.

Backward Compatibility

🟡 WARNING: The "legacy cutover" mechanism allows agents registered before 2025-01-01 to bypass signature checks. While this is a necessary measure for backward compatibility, it may introduce security risks if legacy agents are not adequately monitored or if their behavior is not well understood. Consider implementing logging or monitoring for these legacy entries to track their usage and potential vulnerabilities.

Code Quality and Documentation

  • The code is well-structured and follows the project's style guidelines. The addition of tests for the new cryptographic functionalities is excellent. Ensure that the documentation is updated to reflect the new signature verification process, including examples and potential pitfalls.

Suggested Improvements

💡 SUGGESTION:

  • Consider adding a mechanism to log signature verification failures. This could help in identifying potential attacks or misconfigurations.
  • Implement unit tests specifically targeting edge cases, such as malformed signatures or unexpected data formats, to ensure robustness.

Conclusion

Overall, this pull request significantly enhances the security posture of the Nexus module by implementing Ed25519 signature verification. The changes are well-structured and include necessary backward compatibility measures. However, attention should be given to the implications of allowing legacy agents to bypass signature checks. Ensure thorough testing and monitoring to maintain security integrity.

imran-siddique
imran-siddique previously approved these changes Mar 30, 2026
Copy link
Copy Markdown
Member

@imran-siddique imran-siddique left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed: all mandatory checks pass per copilot-instructions.md.

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Pull Request Review: fix(security): implement ed25519 signature verification in nexus #537


🔴 CRITICAL: Security Issues

  1. Signature Verification Bypass for Legacy Agents:

    • The _is_unsigned_legacy_entry method allows agents registered before 2025-01-01 to bypass signature verification. While this ensures backward compatibility, it introduces a security risk where malicious actors could exploit this loophole by spoofing legacy entries.
    • Recommendation: Implement additional safeguards for legacy entries, such as verifying their registration timestamps against a secure audit log or requiring a transitional signature mechanism.
  2. Canonical Payload Exclusion Logic:

    • The canonical_payload function excludes certain fields (signature, requester_signature) from the payload before signing. However, this exclusion logic is hardcoded and could lead to unintended omissions if new fields are added in the future.
    • Recommendation: Use a configurable list of excluded fields or validate excluded fields against a schema to ensure consistency.
  3. Key Parsing Error Handling:

    • The parse_public_key function raises a generic ValueError when key parsing fails. This could lead to insufficient error context for debugging or logging.
    • Recommendation: Use a more specific exception type (e.g., PublicKeyParseError) and include detailed error messages for better traceability.

🟡 WARNING: Potential Breaking Changes

  1. Legacy Cutover Date:

    • The introduction of a hardcoded legacy cutover date (2025-01-01) could lead to breaking changes for agents registered before this date. If the cutover date changes or is misconfigured, it could inadvertently block legitimate legacy agents.
    • Recommendation: Make the cutover date configurable via environment variables or a configuration file.
  2. Datetime Modernization:

    • The replacement of datetime.utcnow() with datetime.now(timezone.utc) resolves deprecation warnings but may cause subtle behavior changes in systems that rely on naive datetime objects.
    • Recommendation: Ensure all downstream systems and libraries are compatible with timezone-aware datetime objects.

💡 Suggestions for Improvement

  1. Thread Safety:

    • The AgentRegistry class does not appear to use locks or other synchronization mechanisms for concurrent access to shared resources like _manifests and _manifest_hashes.
    • Recommendation: Add thread-safe mechanisms (e.g., asyncio.Lock) to ensure data integrity during concurrent operations.
  2. Test Coverage:

    • While the PR includes tests for cryptographic utilities and escrow signing, there is limited coverage for edge cases, such as malformed payloads, invalid legacy entries, or concurrent registration/deregistration scenarios.
    • Recommendation: Add tests for:
      • Concurrent access to AgentRegistry.
      • Malformed payloads during signature verification.
      • Legacy entry registration and deregistration edge cases.
  3. Documentation:

    • The updated README.md provides a good overview of the changes but lacks detailed examples for handling legacy agents and using the new cryptographic utilities.
    • Recommendation: Expand the documentation to include:
      • A guide for transitioning legacy agents to the new signature-based system.
      • Examples of error handling for signature verification failures.
  4. OWASP Agentic Top 10 Compliance:

    • The PR addresses several OWASP Agentic Top 10 concerns, such as identity spoofing and data integrity. However, it does not address potential replay attacks.
    • Recommendation: Implement nonce-based or timestamp-based replay protection for signed payloads.
  5. Type Safety:

    • The crypto.py module uses Union[dict, BaseModel] for payloads, which could lead to type ambiguity in larger systems.
    • Recommendation: Standardize payload types by enforcing the use of Pydantic models wherever possible.

Summary

This PR introduces critical security improvements to the Nexus module by implementing Ed25519 signature verification. While the changes significantly enhance the security posture of the system, there are some critical issues related to legacy agent handling and canonical payload generation that need to be addressed. Additionally, potential breaking changes and areas for improvement in thread safety, test coverage, and documentation should be considered.


Action Items

  1. 🔴 Address the legacy agent signature bypass risk.
  2. 🔴 Improve canonical payload exclusion logic to prevent unintended omissions.
  3. 🟡 Make the legacy cutover date configurable.
  4. 💡 Add thread-safe mechanisms to AgentRegistry.
  5. 💡 Expand test coverage for edge cases and concurrency.
  6. 💡 Enhance documentation with detailed guides and examples.
  7. 💡 Implement replay protection mechanisms for signed payloads.
  8. 💡 Standardize payload types to enforce type safety.

Let me know if you need further clarification or assistance!

auto-merge was automatically disabled March 30, 2026 05:13

Head branch was pushed to by a user without write access

@github-actions github-actions bot added dependencies Pull requests that update a dependency file size/XL Extra large PR (500+ lines) labels Mar 30, 2026
Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Pull Request Review: fix(security): implement ed25519 signature verification in nexus #537


🔴 CRITICAL Issues

  1. Missing Replay Attack Mitigation:

    • The register and deregister methods in AgentRegistry do not include nonce or timestamp validation for the signed payload. This opens the system to replay attacks, where an attacker could reuse a valid signature to perform unauthorized operations.
    • Action: Add a nonce or timestamp field to the payload and validate it against a cache or database to ensure it hasn't been reused.
  2. Canonicalization Ambiguity:

    • The canonical_payload function excludes certain fields like signature and requester_signature. However, this exclusion is hardcoded and could lead to vulnerabilities if additional sensitive fields are added in the future without updating this function.
    • Action: Use a whitelist approach for fields that should be included in the canonical payload instead of a blacklist. This ensures future fields are not accidentally excluded.
  3. Legacy Cutover Logic:

    • The _is_unsigned_legacy_entry method allows bypassing signature verification for agents registered before 2025-01-01. This creates a security bypass for legacy agents, especially if the registered_at field is tampered with.
    • Action: Store a cryptographically secure hash of the original registration timestamp in the database to prevent tampering. Alternatively, enforce re-registration for legacy agents with proper signatures.

🟡 WARNING: Potential Breaking Changes

  1. Escrow Signature Requirement:

    • The addition of mandatory Ed25519 signatures for escrow creation (create_escrow) changes the behavior of the API. While legacy support is provided, this could still break integrations that rely on the old behavior.
    • Action: Clearly document this change in the release notes and provide migration guidance for users.
  2. Datetime Handling Update:

    • Replacing datetime.utcnow() with datetime.now(timezone.utc) is a breaking change for any downstream code that relies on naive datetime objects.
    • Action: Ensure all downstream modules and external integrations are updated to handle timezone-aware datetime objects.

💡 Suggestions for Improvement

  1. Thread Safety:

    • The AgentRegistry and EscrowManager classes do not appear to have explicit thread-safety mechanisms. If these classes are used in concurrent agent execution scenarios, race conditions could occur.
    • Action: Use locks (asyncio.Lock) or other concurrency primitives to ensure thread-safe operations on shared state.
  2. Enhanced Cryptographic Key Management:

    • The generate_keypair function is useful for testing but lacks integration with secure key storage mechanisms (e.g., HSM, KMS).
    • Action: Add support for secure key storage and retrieval for production environments.
  3. Improved Error Handling:

    • The SignatureError hierarchy is well-structured, but the error messages could include more context (e.g., the offending payload or key).
    • Action: Enhance error messages to include actionable debugging information.
  4. OWASP Agentic Top 10 Compliance:

    • While the PR addresses authentication and integrity, it does not address authorization. For example, the register method does not verify whether the agent has permission to register under the specified owner_id.
    • Action: Implement authorization checks to ensure agents cannot impersonate other owners.
  5. Test Coverage:

    • While the test suite covers basic signing and verification scenarios, it lacks tests for edge cases like:
      • Large payloads.
      • Invalid JSON structures in canonical_payload.
      • Replay attacks (if mitigation is added).
    • Action: Add tests for these edge cases to ensure robustness.

Summary of Changes Required

Security Fixes

  • Implement nonce/timestamp validation to prevent replay attacks.
  • Switch canonical_payload to a whitelist approach.
  • Secure legacy cutover logic to prevent timestamp tampering.

Breaking Changes Mitigation

  • Document changes to escrow signing and datetime handling.
  • Provide migration guidance for legacy agents.

General Improvements

  • Add thread safety mechanisms.
  • Enhance cryptographic key management.
  • Improve error messages and test coverage.

Final Recommendation

This PR introduces critical security improvements but requires additional fixes to address replay attacks and legacy cutover vulnerabilities. Once these issues are resolved, the PR will significantly enhance the security posture of the Nexus module.

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Pull Request Review: fix(security): implement ed25519 signature verification in nexus #537


🔴 CRITICAL Issues

  1. Signature Verification Bypass for Legacy Agents:

    • The _is_unsigned_legacy_entry method allows agents registered before 2025-01-01 to bypass signature verification. This creates a potential attack vector where malicious actors could exploit legacy entries to spoof identities or perform unauthorized actions.
    • Recommendation: Introduce a mechanism to revalidate legacy entries periodically or enforce signature verification for all agents after a grace period. Consider notifying legacy agents to update their credentials before the cutover date.
  2. Lack of Replay Attack Mitigation:

    • The signature verification mechanism does not include a nonce or timestamp in the signed payload. This makes the system vulnerable to replay attacks, where an attacker reuses a valid signature to perform unauthorized actions.
    • Recommendation: Include a unique nonce or timestamp in the payload to ensure that each signature is valid only once or within a specific time window.
  3. Insufficient Error Logging:

    • The verify_signature method raises exceptions for signature verification failures but does not log the failure details. This could hinder debugging and monitoring of potential attacks.
    • Recommendation: Add structured logging for signature verification failures, including the public key, payload, and signature.

🟡 WARNING: Potential Breaking Changes

  1. Backward Compatibility for Legacy Agents:

    • While the PR introduces a "legacy cutover" mechanism, the transition plan for enforcing signature verification after 2025-01-01 is unclear. This could lead to breaking changes for legacy agents if they are not properly updated before the cutover date.
    • Recommendation: Clearly document the transition plan and provide tools or scripts for legacy agents to update their credentials and signatures.
  2. Changes to Escrow Signature Handling:

    • The create_escrow method now requires a private key for signing. This changes the API behavior for agents that previously relied on legacy signatures.
    • Recommendation: Ensure that all dependent modules and external integrations are updated to handle the new signature requirements.

💡 Suggestions for Improvement

  1. Thread Safety:

    • The AgentRegistry class uses in-memory dictionaries (self._manifests and self._manifest_hashes) for storing agent data. These are not thread-safe, which could lead to race conditions in concurrent agent execution.
    • Recommendation: Use thread-safe data structures (e.g., collections.defaultdict with locks) or consider integrating a thread-safe database for state management.
  2. OWASP Agentic Top 10 Compliance:

    • The PR addresses identity spoofing but does not address other OWASP Agentic Top 10 risks, such as secure storage of private keys and protection against sandbox escapes.
    • Recommendation: Ensure private keys are stored securely (e.g., using a hardware security module or encrypted storage). Conduct a thorough review of sandboxing mechanisms to prevent escape vectors.
  3. Type Safety and Pydantic Validation:

    • The canonical_payload function relies on manual exclusion of certain fields (e.g., signature, requester_signature). This approach is error-prone and could lead to accidental inclusion of sensitive fields.
    • Recommendation: Use Pydantic's exclude parameter more extensively and define a clear schema for signable payloads.
  4. Unit Test Coverage:

    • While the PR includes tests for cryptographic utilities and escrow signing, it lacks comprehensive tests for the AgentRegistry methods (register and deregister) and edge cases (e.g., invalid legacy entries).
    • Recommendation: Add unit tests for AgentRegistry to cover scenarios such as:
      • Registration with invalid signatures.
      • Deregistration of legacy agents.
      • Concurrent registration/deregistration operations.
  5. Documentation Updates:

    • The README file has been updated to reflect the new signature requirements, but it lacks detailed examples for legacy agents and the transition plan.
    • Recommendation: Expand the documentation to include:
      • A migration guide for legacy agents.
      • Examples of signing and verifying payloads.
      • Best practices for private key management.

Summary

The PR introduces robust Ed25519 signature verification, significantly improving the security of the Nexus module. However, critical issues related to legacy agent handling and replay attack mitigation must be addressed to ensure comprehensive security. Additionally, potential breaking changes and thread safety concerns should be resolved to maintain system stability and backward compatibility.

Action Items:

  • Address the critical issues related to legacy agent handling and replay attacks.
  • Implement thread-safe mechanisms for concurrent agent execution.
  • Expand unit test coverage and documentation.
  • Plan and communicate the transition strategy for legacy agents.

@CuriousHet
Copy link
Copy Markdown
Author

Hi @imran-siddique, I've pushed a fix to add the missing cryptography dependency to pyproject.toml and moved the test files to the correct location for CI. Could you please re-approve the workflows? Thank you!

@imran-siddique imran-siddique enabled auto-merge (squash) March 30, 2026 06:08
Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Review Summary

This pull request introduces Ed25519 cryptographic signature verification to the Nexus module, addressing critical security gaps and placeholder implementations. The changes enhance the security of agent registration, escrow creation, and other operations by enforcing signature validation. Additionally, it ensures backward compatibility for legacy agents registered before 2025-01-01. The PR also modernizes datetime handling to align with Python 3.12+ standards.

Below is a detailed review of the changes:


🔴 CRITICAL Issues

  1. Signature Verification Bypass for Legacy Agents:

    • The _is_unsigned_legacy_entry method allows agents registered before 2025-01-01 to bypass signature verification. While this ensures backward compatibility, it creates a potential security loophole where malicious actors could exploit legacy entries to bypass verification.
    • Recommendation: Introduce additional checks (e.g., a whitelist of legacy agent DIDs or a migration plan to enforce signatures for all agents after a specific date).
  2. Lack of Replay Protection:

    • The signature verification logic does not include nonce or timestamp validation to prevent replay attacks. An attacker could reuse a valid signature to perform unauthorized actions.
    • Recommendation: Include a nonce or timestamp in the signed payload and verify its freshness during signature validation.
  3. No Cryptographic Key Rotation Mechanism:

    • The implementation does not provide a mechanism for agents to rotate their cryptographic keys. This could lead to security risks if a key is compromised.
    • Recommendation: Add support for key rotation, including updating the registry with new keys and invalidating old ones.

💡 Suggestions for Improvement

  1. Canonical Payload Documentation:

    • The canonical_payload function excludes certain fields (e.g., signature, requester_signature) from the payload. While this is necessary, it should be explicitly documented in the function docstring to avoid confusion.
    • Suggestion: Add detailed documentation specifying which fields are excluded and why.
  2. Unit Test Coverage:

    • While the PR includes tests for cryptographic utilities and escrow creation, it lacks tests for the register and deregister methods in AgentRegistry.
    • Suggestion: Add unit tests to verify signature validation during agent registration and deregistration.
  3. Error Handling Granularity:

    • The SignatureVerificationError does not provide detailed information about why verification failed (e.g., mismatched payload, invalid key). This could make debugging difficult.
    • Suggestion: Enhance the error message to include specific failure reasons.
  4. Backward Compatibility Documentation:

    • The backward compatibility logic for legacy agents is not well-documented in the README or code comments.
    • Suggestion: Add a dedicated section in the README explaining the legacy cutover and its implications.

🟡 Warnings (Potential Breaking Changes)

  1. Datetime Modernization:

    • Replacing datetime.utcnow() with datetime.now(timezone.utc) may cause subtle behavior changes in systems relying on naive datetime objects.
    • Recommendation: Ensure all downstream systems and libraries handle timezone-aware datetime objects correctly.
  2. Dependency Addition:

    • The addition of the cryptography library increases the dependency footprint. This could impact environments with strict dependency constraints.
    • Recommendation: Verify compatibility with all supported Python versions (3.9–3.12) and document the new dependency in the installation guide.

Code Quality Observations

  1. Thread Safety:

    • The AgentRegistry class does not appear to be thread-safe. Concurrent calls to register or deregister could lead to race conditions.
    • Suggestion: Use locks or other synchronization mechanisms to ensure thread safety.
  2. Type Safety:

    • The code uses type hints effectively, but some functions (e.g., sign_data) accept Optional[Any] for private_key, which is too generic.
    • Suggestion: Use stricter type hints (e.g., Optional[ed25519.Ed25519PrivateKey]) for better type safety.

OWASP Agentic Top 10 Compliance

  1. Identity Spoofing:

    • The introduction of Ed25519 signature verification significantly mitigates identity spoofing risks.
  2. Sandbox Escape:

    • No sandbox escape vectors were identified in this PR.
  3. Trust Manipulation:

    • The reputation system and escrow mechanisms are critical for trust management. Ensure that these systems are robust against manipulation (e.g., fake reputation scores).

Final Recommendations

  • Address the CRITICAL issues related to legacy agent verification, replay protection, and key rotation.
  • Implement the suggested improvements to enhance code quality, documentation, and test coverage.
  • Carefully evaluate the potential breaking changes and ensure smooth backward compatibility.

Once the critical issues are resolved and the suggestions are addressed, this PR will significantly improve the security posture of the Nexus module.

Copy link
Copy Markdown
Member

@imran-siddique imran-siddique left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great Ed25519 implementation! One critical security issue:

*Signature verification bypass via user-controlled
egistered_at*
(
egistry.py):
egister()\ calls _is_unsigned_legacy_entry(manifest)\ before overwriting
egistered_at. An attacker can set
egistered_at\ to a pre-cutover date to skip verification.

Fix: Move
egistered_at = datetime.now()\ before the legacy check, or clear the field first.

Also minor: \Optional[any]\ → \Optional[Any]\ in escrow.py.

Happy to re-review!

auto-merge was automatically disabled March 31, 2026 03:44

Head branch was pushed to by a user without write access

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Review Summary

This pull request introduces robust Ed25519 cryptographic signature verification to the Nexus module, addressing critical security concerns and modernizing datetime handling. The changes are well-structured and include comprehensive test coverage. However, there are some areas that require attention to ensure correctness, security, and backward compatibility.


🔴 CRITICAL Issues

  1. Signature Verification Bypass for Legacy Entries

    • The _is_unsigned_legacy_entry method allows agents registered before 2025-01-01 to bypass signature verification. While backward compatibility is important, this creates a potential security loophole where malicious actors could exploit legacy entries to bypass authentication.
    • Recommendation: Implement a stricter mechanism to validate legacy entries, such as requiring a migration process where legacy agents must update their records with valid signatures before the cutover date.
  2. Canonical Payload Exclusion Logic

    • The canonical_payload function excludes certain fields like nexus_signature, last_seen, and trust_score. However, the exclusion logic is hardcoded and may lead to unintended omissions if new fields are added in the future.
    • Recommendation: Use a configurable list of excluded fields or explicitly document the rationale for excluding each field. This ensures future developers understand the implications of modifying the exclusion list.
  3. Insufficient Validation of Public Key Format

    • The parse_public_key function only checks for the prefix ed25519: but does not validate the length or structure of the base64-encoded key. This could lead to runtime errors or security vulnerabilities if malformed keys are used.
    • Recommendation: Add stricter validation for the base64-encoded key length and format.

🟡 WARNING: Potential Breaking Changes

  1. Legacy Cutover Date

    • The introduction of a cutover date for legacy agents (2025-01-01) may cause issues for agents that fail to update their records before the deadline. This could lead to operational disruptions.
    • Recommendation: Provide clear documentation and communication to users about the cutover date and offer tools for migrating legacy entries to the new signature-based system.
  2. Datetime Modernization

    • Replacing datetime.utcnow() with datetime.now(timezone.utc) is a breaking change for any code relying on the old behavior. While this is a necessary update for Python 3.12+, it may affect downstream dependencies.
    • Recommendation: Document this change in release notes and provide guidance for users on updating their code.

💡 Suggestions for Improvement

  1. Error Messages

    • Improve error messages in SignatureDecodeError and SignatureVerificationError to include more context, such as the expected format or the specific reason for failure. This will aid debugging.
  2. Test Coverage

    • While the test coverage for cryptographic utilities is comprehensive, consider adding tests for edge cases in the _is_unsigned_legacy_entry method to ensure it behaves correctly under all scenarios.
  3. Documentation

    • Update the README to include a detailed explanation of the signature verification process, including examples of valid and invalid payloads. This will help users understand the security model.
  4. Performance Optimization

    • The canonical_payload function uses json.dumps with sort_keys=True, which can be computationally expensive for large payloads. Consider caching the canonical payload for frequently accessed data.

Final Assessment

The PR significantly improves the security posture of the Nexus module by introducing Ed25519 signature verification and modernizing datetime handling. However, the legacy entry bypass mechanism and some validation gaps pose critical risks that must be addressed before merging. Additionally, the backward compatibility changes should be clearly documented to avoid disruptions for existing users.

Action Required:

  • Address the 🔴 CRITICAL issues before merging.
  • Document the 🟡 WARNING changes in release notes.
  • Consider implementing the 💡 SUGGESTIONS for further improvement.

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Pull Request Review: fix(security): implement ed25519 signature verification in nexus #537


🔴 CRITICAL: Security Issues

  1. Legacy Cutover Bypass:

    • The _is_unsigned_legacy_entry function allows bypassing signature verification for agents registered before 2025-01-01. While backward compatibility is important, this creates a potential security loophole where malicious actors could exploit legacy entries to bypass signature checks.
    • Recommendation: Introduce a mechanism to revalidate legacy entries periodically or enforce stricter checks (e.g., requiring legacy agents to update their manifests with valid signatures).
  2. Canonical Payload Exclusion:

    • The canonical_payload function excludes certain fields like last_seen, trust_score, and registered_at from the payload. While this is intended to avoid signing metadata, it could lead to security bypasses if these fields are manipulated post-signature.
    • Recommendation: Document the rationale for excluding these fields and ensure that excluded fields cannot be exploited for security bypasses.
  3. Untrusted Key Parsing:

    • The parse_public_key function does not validate the format or size of the base64-decoded key beyond its Ed25519 compatibility. This could lead to potential denial-of-service attacks if malformed keys are passed.
    • Recommendation: Add stricter validation for key size and format before attempting to decode.

🟡 WARNING: Potential Breaking Changes

  1. Legacy Signature Handling:

    • The introduction of mandatory signature verification for register and deregister operations may break compatibility for agents that do not support Ed25519 signatures.
    • Recommendation: Provide clear migration guidelines for agents registered before 2025-01-01 and ensure that the transition is well-documented.
  2. Escrow Signing:

    • The addition of signing support for escrow requests introduces a dependency on private keys. Agents that do not have access to private keys may face issues.
    • Recommendation: Ensure that legacy agents can still operate without private keys during the transition period.

💡 Suggestions for Improvement

  1. Thread Safety:

    • The AgentRegistry class does not appear to handle concurrent access to _manifests and _manifest_hashes. This could lead to race conditions in multi-threaded environments.
    • Recommendation: Use thread-safe data structures (e.g., asyncio.Lock or collections.defaultdict) to manage concurrent access.
  2. OWASP Agentic Top 10 Compliance:

    • While Ed25519 signature verification addresses trust and identity concerns, additional measures like rate-limiting registration attempts and monitoring for anomalous behavior should be considered.
    • Recommendation: Implement rate-limiting and anomaly detection mechanisms for agent registration and deregistration.
  3. Type Safety:

    • The scak_validator parameter in ProofOfOutcome is typed as Optional[Any]. This weakens type safety and could lead to runtime errors.
    • Recommendation: Replace Any with a more specific type or protocol that defines the expected behavior of the validator.
  4. Testing Coverage:

    • While the tests for crypto.py are comprehensive, additional tests for edge cases (e.g., malformed manifests, legacy entries) should be added.
    • Recommendation: Add tests for legacy manifest handling and edge cases in signature verification.
  5. Documentation:

    • The README updates are helpful but lack details on how to handle legacy agents during the transition period.
    • Recommendation: Expand the documentation to include migration steps for legacy agents and examples of signature verification.

Summary

This PR significantly improves the security posture of the Nexus module by introducing robust Ed25519 signature verification. However, the legacy cutover mechanism and thread safety concerns need to be addressed to ensure comprehensive security and reliability. Additionally, potential breaking changes should be mitigated with clear migration paths and documentation.


Action Items

  1. 🔴 Address the legacy cutover bypass by introducing periodic revalidation or stricter checks.
  2. 🔴 Validate the format and size of public keys in parse_public_key.
  3. 🟡 Document migration steps for legacy agents and ensure backward compatibility.
  4. 💡 Improve thread safety in AgentRegistry and other shared state components.
  5. 💡 Expand testing coverage for edge cases and legacy handling.
  6. 💡 Enhance documentation with detailed migration guidelines and security considerations.

Let me know if you need further clarification or assistance!

@CuriousHet CuriousHet deleted the security/nexus-sig-verification-537 branch April 1, 2026 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dependencies Pull requests that update a dependency file documentation Improvements or additions to documentation size/L Large PR (< 500 lines) size/XL Extra large PR (500+ lines) tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

security: implement signature verification in nexus registry

2 participants